NLPR at TREC 2005: HARD Experiments

Authors

Bibo Lv

Jun Zhao

Abstract

1 Overview

This is the third time that the Chinese Information Processing Group of NLPR has taken part in TREC. In the past, we participated in the Novelty track and the Robust track, in which we evaluated our two key notions: the Window-based Retrieval Algorithm and the Result Emerging Strategy [1][2]. This year we focus on investigating the significance of relevance feedback, so the HARD track is our best choice.

HARD 2005 is very different from the track of the past two years. First, metadata is removed from the topic description, so the topic description in HARD is the same as that of the Robust track. Second, passage retrieval is cancelled this year.

This paper introduces our work on the HARD track in TREC 2005, mainly: (1) we propose a new feature selection method for query expansion in relevance feedback; (2) we adopt some query expansion methods. The paper is organized as follows. Section 2 introduces our system, a new term selection algorithm for query expansion, and our clarification forms. Section 3 presents our query expansion methods. Section 4 gives the experimental results, and Section 5 concludes our work.

As to the retrieval model, the Lemur toolkit developed by UMASS and CMU includes six different retrieval models [3]. To facilitate our work, we use Okapi BM25 [4][5] as the retrieval model, which is based on the probabilistic model of Robertson and Sparck Jones. The formula is as follows:

$$w = \frac{(k_1 + 1)\, tf}{K + tf} \cdot \frac{(k_3 + 1)\, qtf}{k_3 + qtf} + k_2 \cdot |Q| \cdot \frac{avdl - dl}{avdl + dl} \qquad (1)$$

$$K = k_1 \left( (1 - b) + b \cdot \frac{dl}{avdl} \right)$$
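The Okapi BM25 weight described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' or Lemur's implementation: it reads tf and qtf as the term frequency in the document and in the query, dl and avdl as the document length and the average document length, and |Q| as the number of query terms. The function names and the default values of k1, k2, k3 and b are hypothetical choices, not settings reported in the paper. A collection-level term weight (such as an IDF factor) is left out because the formula as given does not show one.

```python
def length_norm(k1: float, b: float, dl: float, avdl: float) -> float:
    """K = k1 * ((1 - b) + b * dl / avdl): document-length normalisation."""
    return k1 * ((1.0 - b) + b * dl / avdl)


def bm25_weight(tf: float, qtf: float, dl: float, avdl: float, q_len: int,
                k1: float = 1.2, k2: float = 0.0, k3: float = 7.0,
                b: float = 0.75) -> float:
    """Weight of one query term for one document, following formula (1).

    Default parameter values are illustrative assumptions only.
    """
    K = length_norm(k1, b, dl, avdl)
    # Saturating contribution of the term frequency in the document.
    doc_part = (k1 + 1.0) * tf / (K + tf)
    # Saturating contribution of the term frequency in the query.
    query_part = (k3 + 1.0) * qtf / (k3 + qtf)
    # Length correction: positive for short documents, negative for long ones.
    correction = k2 * q_len * (avdl - dl) / (avdl + dl)
    return doc_part * query_part + correction


if __name__ == "__main__":
    # A document of average length versus one twice as long.
    print(bm25_weight(tf=1, qtf=1, dl=100, avdl=100, q_len=3))
    print(bm25_weight(tf=1, qtf=1, dl=200, avdl=100, q_len=3))
```

With b greater than 0, a longer-than-average document gets a larger K and therefore a smaller weight for the same tf, which is the length normalisation the second equation expresses.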

Similar resources

NLPR at TREC 2004: Robust Experiments

It is the second time that the Chinese Information Processing group of NLPR participates in TREC. In the past, we have investigated the use of two key technologies: Window-based weighting method and Semantic Tree Model for query expansion, with success, to tasks in novelty and robust tracks. We focused on the Robust Retrieval Track at this year’s conference. Based on the previous IR architectur...

NLPR at TREC 2003: Novelty and Robust

This is the first time that the Chinese Information Processing group of NLPR participates in TREC. Our goal this year is to test our IR system and gain some experience with the TREC evaluation. We therefore select two retrieval tasks: the Novelty Track and the Robust Track. We build a new IR system based on two key technologies: a Window-based weighting method and a Semantic Tree Model for query expansion. In t...

Meiji University HARD and Robust Track Experiments

The Lowlands' TREC Experiments 2005

This paper describes our participation in the TREC HARD track (High Accuracy Retrieval of Documents) and the TREC Enterprise track. The main goal of our HARD participation is the development and evaluation of so-called query profiles: short summaries of the retrieved results that enable the user to perform a more focused search, for instance by zooming in on a particular time period. The main goa...

NLPR in TREC 2007 Blog Track

LIU Kang, WANG Gen, HAN Xianpei, ZHAO Jun. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China, 100080. Abstract: This paper describes the opinion retrieval system for TREC 2007 blog track. This paper focuses on two components of the system. One component is important content block detection component which is used to extract blog conten...

Publication date: 2005
